[% setvar title TITLE %]
<div id="archive-notice">
    <h3>This file is part of the Perl 6 Archive</h3>
    <p>To see what is currently happening visit <a href="http://www.perl6.org/">http://www.perl6.org/</a></p>
</div>
<div class='pod'>
<a name='TITLE'></a><h1>TITLE</h1>
<p>Assignment within a regex</p>
<a name='VERSION'></a><h1>VERSION</h1>
<pre>  Maintainer: Richard Proctor &lt;<a href='mailto:richard@waveney.org'>richard@waveney.org</a>&gt;
  Date: 16 Aug 2000
  Last Modified: 1 Oct 2000
  Mailing List: <a href='mailto:perl6-language-regex@perl.org'>perl6-language-regex@perl.org</a>
  Number: 112
  Version: 4
  Status: Frozen</pre>
<a name='ABSTRACT'></a><h1>ABSTRACT</h1>
<p>Provide a simple way of naming and picking out information from a regex
without having to count the brackets.</p>
<a name='DESCRIPTION'></a><h1>DESCRIPTION</h1>
<p>If a regex is complex, counting the bracketed sub-expressions to find the
ones you wish to pick out can be messy.  It is also prone to maintainability
problems if and when you wish to add to the expression.  Using (?:) can be
used to surpress picking up brackets, it helps, but it still gets &quot;complex&quot;.
I would sometimes rather just pickout the bits I want within the regex itself.</p>
<p>Suggested syntax: (?$foo= ... ) would assign the string that is matched by
the patten ... to $foo when the patten matches.  These assignments would be
made left to right after the match has succeded but before processing a
replacement or other results (or prior to a some (?{...}) or (??{...})
code).  There may be whitespace between the $foo and the &quot;=&quot;.</p>
<p>Potentially the $foo could be any scalar LHS, as in (?$foo{$bar}= ... ),
likewise the '=' could be any asignment operator.</p>
<p>The camel and the docs include this example:</p>
<pre>   if (/Time: (..):(..):(..)/) {
        $hours = $1;
        $minutes = $2;
        $seconds = $3;
    }
        </pre>
<p>This then becomes:</p>
<pre>  /Time: (?$hours=..):(?$minutes=..):(?$seconds=..)/</pre>
<p>This is more maintainable than counting the brackets and easier to understand
for a complex regex.  And one does not have to worry about the scope of $1
etc.</p>
<a name='When does the assignment actually happen?'></a><h2>When does the assignment actually happen?</h2>
<p>In general all assignments should wait to the very end, and then assign
them all.  However before code callouts (?{...}) and friends, the named
assignments that are currently defined should be made so that
the code can refer to them by name.</p>
<p>It may be appropriate for any assignments made before a code callout
to be localised so they can unrolled should the expression finally fail.</p>
<a name='Named Backrefs'></a><h2>Named Backrefs</h2>
<p>The first versions of this RFC did not allow for backrefs.  I now think this
was a shortcoming.  It can be done with (??{quotemeta $foo}), but I find this
clumsy, a better way of using a named back ref might be (?\$foo).</p>
<a name='Scoping'></a><h2>Scoping</h2>
<p>The question of scoping for these assignments has been raised, but I don't
currently have a feel for the &quot;best&quot; way to handle this.  Input welcome.</p>
<p>Hugo: I think it should be defined to act the same as in (??{...}), whenever
we get around to defining that.</p>
<a name='Brackets'></a><h2>Brackets</h2>
<p>Using this method for capturing wanted content, it might be desirable to stop
ordinary brackets capturing, and needing to use (?:...).  I therefore suggest
that as an enhancement to regexes that /b (bracket?) ordinary brackets just
group, without capture - in effect they all behave as (?:...).</p>
<a name='CHANGES'></a><h1>CHANGES</h1>
<p>V3 - added bit about backrefs, and brackets.</p>
<p>V4 - Clarified a few things and froze</p>
<a name='IMPLENTATION'></a><h1>IMPLENTATION</h1>
<p>Currently all $scalars in regexes are expanded before the main regex compiler
gets to analyse the syntax.  This problem also affects several other RFCs
(166 for example).  The expansion of variables in regexes needs for these
(and other RFCs) to be driven from within the regex compiler so that the
regex can expand as and where appropriate.  Changing this should not affect
any existing behaviour.</p>
<a name='REFERENCES'></a><h1>REFERENCES</h1>
<p>I brought this up on p5p a couple of years ago, but it was lost in the
noise...</p>
<p>RFC 166</p>
<p>Perlstorm #0040</p>
</div>
